Abstract and Claims.

This essay tries to expound a conception of interval measures that permits a particular
approach to partial ignorance decision problems. The virtue of this approach for artificial
reasoning systems is that the following questions become moot: 1. which secondary criterion to
apply after maximizing expected utility, and 2. how much indeterminacy to represent. The cost of
the approach is the need for explicit epistemological foundations: for instance, a rule of acceptance
with a parameter that allows various attitudes toward error. Note that epistemological foundations
are already desirable for independent reasons.

The development is as follows: 1. probability intervals are useful and natural in A.I. systems;
2. wide intervals avoid error, but are useless in some risk-sensitive decision-making; 3. yet one
may obtain narrower, or otherwise decisive intervals with a more relaxed attitude toward error; 4.
if bodies of knowledge can be ordered by their attitude to error, one should perform the decision
analysis with the acceptable body of knowledge that allows the least error, of those that are useful.
The resulting behavior differs from that of a Bayesian probabilist because in the proposal, 5.
intervals based on successive bodies of knowledge are not always nested; 6. the use of a probability
for a particular decision does not require commitment to the probability for credence; and 7. there
may be no acceptable body of knowledge that is useful; hence, sometimes no decision is mandated.


I.  Interval  Measures. 

By now, the use of an interval measure is
regarded highly for probability judgements in
reasoning systems. Researchers selecting formalisms
for quantifying belief have all recognized the virtues of
(partial)^1 indeterminacy in probability judgement
([Bar81], [GLF81], [Dil82], [Low82], [WeH8:], [Qui83],
[Wes83], [Gin84], [LuS84], [Str84], etc).

Intervals allow varying degrees of commitment
in probability assertion. At the extremes, P(A) = [0, 1]
is uncommitted, while P(A) = [.76, .76] is
consummate. Some have argued that indeterminacy
captures "pre-systematic" notions of belief and
disbelief [Sha76], [Lev80a].^2 Since 0 ≤ inf P(A) + inf
P(~A) ≤ 1, the agent can assign zero belief to a
proposition even though he is not certain that it is
false. Indeterminacy is useful to the subjectivist when
eliciting bounds on probabilities (especially from
equivocating experts), and to the empiricist for
expressing the Neyman-Pearson confidence results of
population sampling.

Intervalism is also natural in detachment. When
two premises are held with the point probabilities
1 - ε and 1 - δ, the conclusion detached from them
receives only the interval [1 - ε - δ, 1].
If probabilities are based on direct inference from the
class A, the probability of "Px ∧ Qx" for some x ∈ A
would be an interval, despite having started with
probabilities that were points (see [frsi)83], [Che83],
and [Nil84]).
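
The detachment interval can be checked numerically. The sketch below is illustrative only: it assumes one reading of the passage, namely premises P(A) = 1 - ε and P(A ⊃ Q) = 1 - δ, and the function names are hypothetical. It relies on the standard lower bound for a conjunction, P(X ∧ Y) ≥ P(X) + P(Y) - 1, and on the fact that A together with A ⊃ Q entails Q.

```python
from typing import Tuple


def conjunction_bounds(p_x: float, p_y: float) -> Tuple[float, float]:
    """Bounds on P(X and Y) when only the two marginals are known."""
    lower = max(0.0, p_x + p_y - 1.0)
    upper = min(p_x, p_y)
    return lower, upper


def detachment_interval(eps: float, delta: float) -> Tuple[float, float]:
    """Interval for P(Q), assuming P(A) = 1 - eps and P(A implies Q) = 1 - delta.

    Q is entailed by the conjunction of the premises, so P(Q) is at least the
    lower bound on that conjunction; the premises do not cap P(Q) below 1.
    """
    lower, _ = conjunction_bounds(1.0 - eps, 1.0 - delta)
    return lower, 1.0


if __name__ == "__main__":
    # Two point-valued premises yield an interval-valued conclusion.
    print(detachment_interval(0.05, 0.10))
```

Even with premises that are points, the conclusion is an interval whose width grows with ε + δ.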

Many advocates of interval belief measures in
A.I. link their arguments to Shafer's interpretation
[Sha76] of Dempster's inference system [Dem68].
Shafer's theory is claimed to provide a valuable
representation of intervals (via mass functions), and a
simple, consistent approach to resolving apparent
disputes when combining evidence (via Dempster's
rule). These claims are evaluated elsewhere [Kyb85],
[Lev80a], [Zad79]. Shafer's theory is not unique in its
ability to cope with disagreeing evidence; indeed, a
system of belief would be impoverished if it made no
provisions for disagreement (see Levi's remarks
[Lev80a]; also, there are indeterminate systems due to
Levi, Smith, Schick, Good, and Kyburg). Further,
Dempster's rule for combining evidence is relatively
presumptuous as a form of conditionalization [Dem68],
[Lev80b], [Kyb85].

Putting aside the prospects for Dempster's rule,
we are left with these indeterminate probabilities, and
with an ensuing decision problem. Barnett [Bar81] and
Lowrance [Low82] have both suggested that a research
goal should be a fully developed decision theory based
on interval measures.

Luce and Raiffa call decision problems with
indeterminate probabilities "partial ignorance"
problems, and earlier work on partial ignorance is
discussed in [Lou85]. Wesley, Lowrance, and Garvey
[WLG84] offer a candidate theory that is for use with
Shaferian beliefs and that ignores risk; it has been
discussed elsewhere [Lou84].

II. Estimation and Decision.

With interval probabilities or interval utilities,
expected utilities are intervals. If interval probabilities
are narrow (or otherwise fortuitous) there is no
problem; expected utility intervals can be ordered in
the natural way (see below), and the best act identified.
In a 1:1 lottery that depends on the outcome of a coin
toss, if P(heads) is [.7, .8] the decision should be clear
via the obvious ordering; if it is [.3, .8], the decision
may not be clear. The decision may also not be clear if
the interval is narrow, but unfortuitous, e.g., [.49, .52].

If the maximization of expected utility (MEU) is
the sole solution criterion, there may be no defensible
ordering of the utility intervals that identifies a best
act. Of course, MEU with point-probabilities can be
ambiguous too. This latter ambiguity is often
tolerated; if two acts have the exact same expected
utility, the sameness of utility is supposed to reflect
indifference. But ambiguity with interval probabilities
may not be tolerable because intervals often model
ignorance, not indifference. It is not the case that the
two acts couldn't be ordered in a relevant and
accountable way. Rather, not enough is known to
order them.

There are two problems here. First, there is the
estimation problem: what should be the degree of
certainty attributed to a proposition? Second, there is
the decision problem: which act should be chosen
among available acts, when the agent is not indifferent
about them all? In the estimation problem, error is
avoided by using intervals. In the decision problem,
ambiguity is avoided by eschewing intervals. In order
to solve both problems simultaneously, there must be
some compromise.


III. Secondary Criterion Solutions.

Let Π be the largest set of probability
distributions satisfying all of the interval constraints.
Calculating expected utilities in the usual way, for act
a_i in the presence of uncertainties h_j:

$$u_{P_k}(a_i) = \sum_j P_k(h_j)\, u(\langle a_i, h_j \rangle), \quad \forall P_k \in \Pi;$$

$$U(a_i) = \{\, u_{P_k}(a_i) : P_k \in \Pi \,\} = [\inf U(a_i),\ \sup U(a_i)].$$

The natural way to (partial-) order acts with
indeterminate utilities is by dominance: a_i > a_j iff inf
U(a_i) > sup U(a_j). If there is a unique maximal
element in the order, a*, then the decision problem is
solved. The probabilities, though individually
indeterminate, are nevertheless collectively decisive.
But in general, there will be some set of maxima, {a_i}.
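
The dominance ordering and the resulting maximal set can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: it reuses the 1:1 coin lottery of Section II, assumes a unit stake (win 1 on heads, lose 1 on tails, 0 for abstaining), and all function names are hypothetical.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]


def dominates(u_i: Interval, u_j: Interval) -> bool:
    """a_i > a_j iff inf U(a_i) > sup U(a_j)."""
    return u_i[0] > u_j[1]


def maximal_set(utilities: Dict[str, Interval]) -> List[str]:
    """Acts that no other act dominates."""
    return [
        a for a, u_a in utilities.items()
        if not any(dominates(u_b, u_a) for b, u_b in utilities.items() if b != a)
    ]


def lottery_utilities(p_heads: Interval, stake: float = 1.0) -> Dict[str, Interval]:
    """Expected-utility intervals for betting on heads at 1:1 versus abstaining.

    The bet's expected utility, stake * (2p - 1), is increasing in p, so its
    extremes sit at the endpoints of the probability interval.
    """
    lo, hi = p_heads
    return {
        "bet": (stake * (2 * lo - 1), stake * (2 * hi - 1)),
        "abstain": (0.0, 0.0),
    }


if __name__ == "__main__":
    for p in [(0.7, 0.8), (0.3, 0.8), (0.49, 0.52)]:
        print(p, maximal_set(lottery_utilities(p)))
```

Under these assumptions only the [.7, .8] interval yields a singleton maximal set; the wide interval and the narrow but unfortuitous one both leave two maxima.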

Some authors ([Hur51], [Goo83], [Fis65],
[Lev80b]) suggest that a* can be identified in the
maximal set by one of the so-called weaker methods:
maximin, min-regret, or lexi-min methods. These are
the methods recommended for decision problems
under uncertainty, and their common character that is
crucial here is that they make no use of probability
judgement. Presumably the probability information
has been milked for all it is worth, under the primary
method of MEU, and secondary methods will finish
the job of identifying a*. Unless there is an unforeseen
equality of point-valued utilities, the weaker method
guarantees identifying a unique act. The weaker
method could have been applied in the first place but
for its admitted weakness. It is considered weak
precisely because it ignores probability judgement. It
is employed secondarily precisely because it
presupposes that probability judgement will be of no
further use, which is exactly the case among the
maximal set after the application of MEU.

For programmed systems, there is still the
problem of choosing one from among the various
secondary criteria. Clearly there are situations in
which maximin is inappropriate, and similarly for
min-regret, for optimism-pessimism, etc. Supervaluations
would be cautious, but impotent; taking the most
popular mandate among various criteria would be ad
hoc. One could attempt to discriminate those
situations in which one method applies and others do
not, but no such attempt has been successful.
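
The difficulty of choosing among secondary criteria shows up even in tiny problems. The sketch below uses a hypothetical two-act, two-state payoff table (not taken from the paper) on which maximin and min-regret mandate different acts; neither function consults any probability.

```python
from typing import Dict, List

# Hypothetical payoff table: one row per act, one column per state of nature.
PAYOFFS: Dict[str, List[float]] = {
    "a1": [1.0, 1.0],
    "a2": [0.0, 10.0],
}


def maximin(payoffs: Dict[str, List[float]]) -> str:
    """Choose the act whose worst-case payoff is largest."""
    return max(payoffs, key=lambda a: min(payoffs[a]))


def min_regret(payoffs: Dict[str, List[float]]) -> str:
    """Choose the act whose largest regret is smallest."""
    n_states = len(next(iter(payoffs.values())))
    # Best achievable payoff in each state, over all acts.
    best = [max(row[s] for row in payoffs.values()) for s in range(n_states)]
    worst_regret = {
        a: max(best[s] - row[s] for s in range(n_states))
        for a, row in payoffs.items()
    }
    return min(worst_regret, key=worst_regret.get)


if __name__ == "__main__":
    print("maximin:", maximin(PAYOFFS))
    print("min-regret:", min_regret(PAYOFFS))
```

Here maximin prefers the safe act a1, while min-regret prefers a2 because forgoing the payoff of 10 would be the larger regret.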

IV.  A  Different  Proposal. 

Here is an entirely different way of solving
partial ignorance problems. If MEU with the given
probability intervals is indecisive, MEU can be
retained and the probability intervals refined. In the
Bayesian tradition, refinement of intervals is done
subjectively, with no additional empirical information.
In the Neyman-Pearson tradition, refinement is done
objectively and requires additional empirical
information. In either case, refinement further
determines probabilities. An automatic reasoning
system may be required to be objective, may not have
recourse to additional information, and may require
the preservation of indeterminacy. Fortunately, it's
possible to refine intervals objectively, with no
additional empirical information, and without losing
the indeterminacy of probabilities. This latter
possibility is presented more carefully in [Lou85].

Let credal state be described not by one set of
feasible distributions, Π, but by a sequence of sets
<Π_i>. Each Π_i is based on a body of knowledge
formed with some quantifiable attitude toward error
(so there is a companion sequence, <K_i>, where each
body of knowledge, K_i, has an integer index and a real
error). Successive K's are more informative, but
predecessors are less prone to error. Each
indeterminate expected utility calculation is done with
respect to one element in the Π-sequence, but the
whole Π-sequence constitutes the credal state.
Therefore, different indeterminate probabilities, with
different maximal sets, can be consulted for the
purposes of decision without changing the
indeterminacy of the credal state.

This representation finesses the question of how
narrow intervals ought to be. Imagine the expert who
first reports the interval is [.3, .7], but can be coaxed
into reporting the more useful [.35, .65]. Which
interval gets represented? In this proposal, both
should be represented. Intervals should be as narrow
as permitted given the magnitude of error (α)
associated with the body of knowledge on which the
intervals are based. They will be [0, 1] in Π_0.
They may be degenerate in the very late Π's. And they
should be variously narrow (though not necessarily
nested) in between.

In practice, this proposal requires additional
represented information, or additional inference rules
and epistemological assumptions. It may be possible
simply to assert and to represent both sequences, <Π_i>
and <K_i>. But more likely, Π's will have to be
generated from K's, and successive K's from some
initial base, K_init. A combination of the two methods,
generation and assertion, is convenient.

Generating Π's from K's requires the adoption
of some theory of probability. It could be as simple as
taking statements in K to be constraints on
distributions, or conditionalizing some prior on the
contents of K, or it could be some theory of
frequency-based or chance-based direct inference.

Generating succ(K) amounts to making
additional assumptions. It could be done in a number
of ways; one possibility is to use an acceptance rule
(see also below, on "higher-order" probabilities). Such
a rule would describe when a statement is acceptable
and would thus determine to which K's it belongs. If
the rule is based on probabilities relative to K_init, for
instance, then A belongs to all those successor K's, K_i,
such that 1 - P(A | K_init) is less than the error
associated with K_i. A different probabilistic rule
would take succ(K) to be K ∪ {A}, where A is the next
most probable statement relative to K, of statements of
some special form. Note that with these rules, the
K-sequence is nested. Acceptance rules in the literature
are more elaborate. See [Kyb70] for additional
acceptance rules and their evaluation.

Decision-making amounts to exploring the
Π-sequence in best-first order until either (1) the maximal
set under Π is a singleton, in which case the problem is
solved; or (2) Π is a singleton, which leaves the
standard decision problem under risk; or (3) the error
associated with Π is intolerably large for this decision
problem, in which case MEU with no acceptable set of
assumptions can legislate unambiguously.
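
The three stopping conditions can be sketched as a loop over the sequence. This is a minimal sketch under stated assumptions rather than a definitive implementation: each level is represented directly by its error and by precomputed expected-utility intervals, a singleton Π is recognized by all intervals being degenerate, and the names `Level`, `maximal_set`, and `decide` are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Interval = Tuple[float, float]
# One level of the sequence: (error attached to K_i, utility intervals under Pi_i).
Level = Tuple[float, Dict[str, Interval]]
Outcome = Tuple[str, Optional[List[str]], Optional[float]]


def maximal_set(utilities: Dict[str, Interval]) -> List[str]:
    """Acts not dominated by any other act (dominance: inf U(b) > sup U(a))."""
    return [
        a for a, (_, hi_a) in utilities.items()
        if not any(lo_b > hi_a for b, (lo_b, _) in utilities.items() if b != a)
    ]


def decide(levels: List[Level], tolerable_error: float) -> Outcome:
    """Visit levels from least to most error-prone; stop at the first that settles matters."""
    for error, utilities in sorted(levels, key=lambda level: level[0]):
        if error > tolerable_error:
            break  # case (3): every remaining level is too error-prone
        maxima = maximal_set(utilities)
        if len(maxima) == 1:
            return "decided", maxima, error  # case (1)
        # Assumption: a singleton Pi shows up here as all-degenerate intervals.
        if all(lo == hi for lo, hi in utilities.values()):
            return "risk problem", maxima, error  # case (2)
    return "no decision", None, None


if __name__ == "__main__":
    # Illustrative inputs only: two levels with errors .01 and .25.
    levels = [
        (0.25, {"a1": (0.0, 10.0), "a2": (-3.0, -1.5)}),
        (0.01, {"a1": (-16.8, 10.0), "a2": (-5.5, 0.0)}),
    ]
    print(decide(levels, tolerable_error=0.30))
    print(decide(levels, tolerable_error=0.05))
```

Because the loop stops at the first decisive level, the act returned always rests on the least error-prone body of knowledge that is useful, and a low error tolerance can leave the problem undecided.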

The reasonableness of this proposal depends on
whether there is independent reason to use intervals of
a particular width. It may be that epistemological
considerations require that certain intervals be used;
e.g., the narrowest intervals at .95 confidence [Kyb85].
But if not, if confidence levels .94 and .96 are also
useable, then decision analysis might as well proceed
with intervals that are decisive, rather than with those
that are indecisive. There is no reason to avoid
tolerable error if doing so results in uninformative
analysis.^3 If the MEU calculation is not satisfactory
under the assumptions held, it could be that the agent
has not assumed enough. The analysis should then be
founded on an augmented set of assumptions.

Conversely, there is no reason to invite error in
the analysis if the analysis is already sufficiently
informative. So of the many Π's that are decisive, the
one that is least prone to error has epistemic priority.
The augmented set of assumptions should be the
next-least in order of presumptiveness. No more
assumptions should be made than are necessary for
decision.

Consider the claim that rational commitment
ceases with the restriction to the maximal set, or that
the agent must sometimes suspend judgement when
the set of maxima is not a singleton. Lopes, voicing a
common intuition, quips that suspending judgement
among choices with overlapping expected utility
ranges is no more defensible than suspending
judgement among choices with overlapping outcome
ranges [Lop83]. Lopes' remark is forceful precisely
because it points out the arbitrariness of interval width.
Why invite error by using intervals narrower than [0,
1]? Because [0, 1] intervals are not satisfactorily
informative. But if the maximal set under narrower
intervals is unsatisfactory, and the limit of tolerable
potential error has not been reached, why not use still
narrower intervals? One is already willing to forego
certainty, and the amount of certainty one is willing to
forego depends on the other desiderata, including
decisiveness.

We still avoid error by using indeterminacy; we
retain the early elements in the <Π_i> sequence, rather
than settling immediately on the most specific element
(or on some P, s.t. P ∈ Π_i for all i). There may not be a
most specific element in the sequence (this is explored
in example C, below). And there may be genuine
instances in which no substantiable set of assumptions
legislates a unique decision or formulates a standard
risk problem. In such cases, indeterminacy is required
to indicate ignorance, or if either is possible, the need
for more sampling, or for suspension of judgement.

For example, consider a probabilistic acceptance
rule: statements are accepted into a body of knowledge
when their probability relative to the base body exceeds
1 - α. For a decision problem where the maximum ratio
of odds is w, it would be pointless to perform an MEU
analysis in some K where α > 1 - w. If the lottery pays
20:1, w = .95. If all Π's based on less error than .05 are
indecisive, no decision is legislated (see [Kyb85] and
[Lou85] for discussion).

V. Examples and Contrasts.

We discuss the following decision problem.^4
Upon finding a berry, the agent has to decide whether
to eat it (a_1), or not to eat it (a_2). If it is eaten, it
matters whether or not it was a good berry (G). If it is
not eaten, it matters whether or not the agent later gets
hungry (H). Let u(<a_1, G>) = 10; u(<a_1, ~G>) =
-30; u(<a_2, H>) = -10; and u(<a_2, ~H>) = 0.

A. lower level confidence intervals.

Suppose the probability reports for G and for H
are based on Clopper-Pearson intervals. Of 4 berries
eaten, 4 were good. On 14 excursions of this kind, the
agent got hungry (without eating) 3 times. At .99
confidence, P(G) = [.33, 1] and P(H) = [0, .55]. So
u(a_1) = [-16.8, 10]; u(a_2) = [-5.5, 0]. The maximal
set is {a_1, a_2}. But at the confidence level .75, P(G) =
[.75, 1] and P(H) = [.15, .3]. So u(a_1) = [0, 10]; u(a_2)
= [-3, -1.5]. a_1 > a_2; a_1 is uniquely maximal. Note
that if a_1 and a_2 had been ranked by utility midpoints
at .99, mp_1 = -3.4; mp_2 = -2.75; one would have
concluded contrarily that a_2 > a_1.
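
The arithmetic of this example can be reproduced directly. The sketch below is illustrative: it takes the lower bound of P(G) at .99 confidence to be .33, the value consistent with the reported u(a_1) = [-16.8, 10] and midpoint -3.4, and it treats the probability intervals as given rather than recomputing Clopper-Pearson bounds. Function and variable names are hypothetical.

```python
from typing import Dict, Tuple

Interval = Tuple[float, float]


def eu_interval(p: Interval, u_if_true: float, u_if_false: float) -> Interval:
    """Expected-utility interval when the act depends on a single event.

    Expected utility is linear in the event's probability, so its extremes
    occur at the endpoints of the probability interval.
    """
    ends = [q * u_if_true + (1 - q) * u_if_false for q in p]
    return min(ends), max(ends)


def midpoint(interval: Interval) -> float:
    return (interval[0] + interval[1]) / 2


# Probability intervals for G (good berry) and H (hungry) at each confidence level.
LEVELS: Dict[float, Dict[str, Interval]] = {
    0.99: {"G": (0.33, 1.0), "H": (0.0, 0.55)},
    0.75: {"G": (0.75, 1.0), "H": (0.15, 0.30)},
}


def analyse(confidence: float) -> Dict[str, Interval]:
    probs = LEVELS[confidence]
    return {
        "a1": eu_interval(probs["G"], 10, -30),   # eat: matters whether G
        "a2": eu_interval(probs["H"], -10, 0),    # abstain: matters whether H
    }


if __name__ == "__main__":
    for conf in (0.99, 0.75):
        u = analyse(conf)
        print(conf, u, "a1 dominates a2:", u["a1"][0] > u["a2"][1])
    u99 = analyse(0.99)
    print("midpoints at .99:", midpoint(u99["a1"]), midpoint(u99["a2"]))
```

Dominance at the .75 level favors a_1, whereas ranking by midpoints at the .99 level would favor a_2, which is the contrast the example draws.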

B. direct inference and probabilistic acceptance rule.

Suppose %(berries, good) = [.3, .8] and
%(excursions, get-hungry) = [0, 1] and %(soft berries,
good) = [.84, .88]. Presumably this is accepted based
on sampling with, say, at least .999 confidence. If P(G)
is based on the [.3, .8] interval, both a_1 and a_2 are
maximal. The decisive [.84, .88] interval can't be used
for P(G) unless it is accepted that the berry is soft.
Even if there is independent reason to believe P(this
berry ∈ soft berries) = .999, the probability of G would
be [.3, .8]. It's natural to consider the acceptance of
"this berry ∈ soft berries". This allows direct inference:
P(G) must be [.84, .88] if this is all that is known.^5
The decision to do a_1 is based on dominance with the
narrower interval.

C. convex Bayesian vs. Savage's Bayesian.

A Bayesian who considers all the distributions in
a closed convex set can accept different constraints on
this set at different levels of acceptance (cf. [Lev80b]).
Typical constraints could be conditions (as in example
B), or bounds on marginal probabilities (as in example
A). Additional knowledge can lead to additional
constraints, which can decrease membership in Π and
so are more informative (though additional knowledge
does not always lead to additional constraints;
sometimes it can invalidate a constraint). Some
constraints may not be as warranted as others, and
their use introduces more possibility of error. If the set
is indecisive, try the MEU analysis with the next set of
constraints.

Savage would have the agent settle on the most
specific set (if there is one), and eliminate the excess
indeterminacy of the preceding sets. If all the sets are
nested (for all i > j, Π_j ⊇ Π_i), there is no difference
between the decisions made by this convex Bayesian
and by Savage's Bayesian.

But the sets are not always nested. The most obvious
source of non-nesting is due to conditionalization.^6
Suppose Π_1 is based on acceptance so stringent that
probabilities are conditional only on A. Π_2 takes both
A and B as conditions; B is acceptable as a condition at
this level (perhaps B is treated by Jeffrey's rule in Π_1;
it doesn't matter here). Then there's no reason for Π_2
to be a subset of Π_1.

Let A entail ~H; P(G | A) = [.6, .8]; P(G | A, B)
= [.3, .4]. Intuitively, A might be the conjunction "just
ate & the berry looks good" while B might be "the
lighting is misleading". Π_1, with P(G) = [.6, .8],
indicates both a_1 (eating) and a_2 (not eating) as
maximal. Π_2 mandates a_2, with P(G) = [.3, .4], which
is not a sub-interval of [.6, .8]. Now suppose the next
decision involving P(G) is a 1:1 lottery. Π_1 mandates
entering the lottery, and because Π_1 is decisive and
epistemically prior, Π_2 is ignored. Savage would
continue to use P(G) from Π_2, and would avoid the
lottery.

So  preservation  of  the  "excess"  indeterminacy  is 
necessary  despite  temporary  refinement  for  the 
purposes  of  the  current  decision. 

D. Shaferian discounting.

It's tempting to consider Shafer's discounting
parameter to generate successive Π's.

The belief with mass m_1(G) = .7 and m_1(~G) =
.3 is to be combined with a belief m_2(G) = .6; m_2(~G) =
.4 based on a new, independent source. The latter's
impact is to be discounted by some amount r. Let ~H
be accepted. If r < .23, then P(G) > .75, and a_1 = a*;
otherwise a_2 = a*. Note that for any value of r here,
the resulting probability of G is determinate.
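
The threshold on r can be checked by carrying out the combination explicitly. The following is a sketch, assuming the usual form of Shafer's discounting (committed mass scaled by 1 - r, the remainder moved to the whole frame) and Dempster's rule restricted to the two-element frame {G, ~G}; the dictionary keys and function names are hypothetical.

```python
from typing import Dict

Mass = Dict[str, float]  # keys: "G", "notG", "Theta" (the whole frame)


def discount(m: Mass, r: float) -> Mass:
    """Shafer discounting: scale committed mass by (1 - r), move the rest to the frame."""
    return {
        "G": (1 - r) * m["G"],
        "notG": (1 - r) * m["notG"],
        "Theta": r + (1 - r) * m["Theta"],
    }


def combine(m1: Mass, m2: Mass) -> Mass:
    """Dempster's rule on the frame {G, not-G}, renormalizing away the conflict."""
    conflict = m1["G"] * m2["notG"] + m1["notG"] * m2["G"]
    norm = 1.0 - conflict
    g = m1["G"] * m2["G"] + m1["G"] * m2["Theta"] + m1["Theta"] * m2["G"]
    not_g = m1["notG"] * m2["notG"] + m1["notG"] * m2["Theta"] + m1["Theta"] * m2["notG"]
    theta = m1["Theta"] * m2["Theta"]
    return {"G": g / norm, "notG": not_g / norm, "Theta": theta / norm}


M1: Mass = {"G": 0.7, "notG": 0.3, "Theta": 0.0}
M2: Mass = {"G": 0.6, "notG": 0.4, "Theta": 0.0}


def p_good(r: float) -> float:
    """Probability of G after discounting the new source by r."""
    return combine(M1, discount(M2, r))["G"]


def best_act(r: float) -> str:
    """With ~H accepted, u(a2) = 0 and u(a1) = 10 p - 30 (1 - p)."""
    p = p_good(r)
    return "a1" if 10 * p - 30 * (1 - p) > 0 else "a2"


if __name__ == "__main__":
    for r in (0.0, 0.2, 0.25, 1.0):
        print(r, round(p_good(r), 4), best_act(r))
```

Because the first belief leaves no mass on the whole frame, the combined belief never does either, which is why the probability of G stays determinate for every r.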

Are some values of r more cautious than others?
If r is large, the informative impact of the second belief
is lessened, and it is combined with caution. But a
cautious attitude toward the new belief is not
necessarily a cautious attitude toward the possibility of
error, unless the new belief is the only possible source
of error. When conditions were not accepted in
example C, it was because they were relatively
uncertain, not because they were new. Here, it may be
that the full weight of the new belief is required to
avoid error. It would be erroneous, for instance, to
ignore the new belief completely. The parameter r
here is being used like Carnap's λ. There is no
epistemic relation given between r and error, hence, no
priority of one solution over the other.

Perhaps caution should be reflected by
discounting both belief functions. This begs the
question, in what proportion should they be
discounted? If there are two parameters that can be
varied, the Π's generated will be only partially
ordered.

It may be possible to use Shafer's formalism to
generate the Π-sequence, but its use would require
more argument.

VI. Epistemological Considerations.

A. On Revisions of the Knowledge Base.

A behavioral interpretation of probability
suggests the identification of a* as additional evidence
about probability judgement. Whatever the means of
a*'s identification, there is a set, Ψ, of admissible
probability distributions, according to each of which,
a* is the unique maximum by MEU means alone.
Behaviorists hold that once a* is identified, the agent's
credal state contracts to the more precise Π', the
intersection of Ψ and Π, at least as a description
appropriate at the time of decision. Presumably, if
there is no subsequent revision, the more precise
description of past state continues to describe the
current state. If this is right, then credal state depends
on the decisions made. Faced with a different decision
structure, a*, hence Ψ, and finally credal state, might
have been different.

Upon each decision, the agent must be
consistent, in this behaviorally strong sense. Decisions
always reveal credal state and always do so through
MEU.

There are enormous implications of this
revelation-through-behavior stance for the
management of knowledge bases. No matter how
tentative the decision, and whatever its content or
manner of selection, the knowledge base must
represent only the distributions that are
MEU-admissible for that decision. If only a single
distribution is MEU-admissible, then that distribution
specifies the new state of the program's belief. And
this has been done with the addition of no relevant
empirical knowledge! All that distinguishes the new
state from the old is the actualization of one particular
problem structure, among the many that could have
been faced.

If the interpretation of probability is subjective
as well as behavioral, the agent or reasoning system can
spuriously return to the more permissive credal state,
Π. But if this is to be a rule for revision, there seems
no point in making the contraction. If it is not a rule,
then there is still the onerous possibility of spurious
change to some other credal state, and worse, the
possibility of no change whatsoever after contraction.

Either course violates legitimate counterfactual
intuitions pertaining to the past decision. Suppose the
secondary method is always a tournament of
coin-flipping. Upon the last toss of heads, a_i is chosen, and
Π' is obtained from Π by the deletion of all
distributions that do not mandate a_i. Thus, it no
longer is the case that "had the toss been tails, a_j
would have been mandated," though we quite
reasonably take such to have been the case.

Starr [Sta66] suggests a normative criterion for
identifying the optimal act when Π is not a singleton.
Suppose the distributions in Π can be parameterized
by some θ. Suppose also that the set of parameter
values Θ, corresponding to the Π distributions, is
measured by an additive indifference "prior". So
subsets of Π are also measured. Consider various acts.
An act is mandated by each of its MEU-admissible
distributions, which collectively form some subset ω ⊆
Π. Starr's criterion chooses the act with the ω that
maximizes the measure (i.e., that has the greatest
number of feasible admissible distributions).

Starr's criterion is a prescription for decision,
not for the adoption of a narrower credal state.
Behaviorists would contract to ω.
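
Starr's criterion can be illustrated on a one-parameter family. The sketch below is an assumption-laden toy, not Starr's own formulation: Π is parameterized by p = P(G) ranging over [.6, .8] (the interval from example C, where ~H is accepted so that u(a_2) = 0), the indifference "prior" is taken to be uniform over p, and the measure of each act's admissible subset is approximated on an evenly spaced grid. All names are hypothetical.

```python
from collections import Counter
from typing import Dict, Tuple


def expected_utilities(p_good: float) -> Dict[str, float]:
    """Berry problem with ~H accepted: eating depends on G, abstaining yields 0."""
    return {"a1": 10 * p_good - 30 * (1 - p_good), "a2": 0.0}


def starr_choice(p_lo: float, p_hi: float, n_grid: int = 1001) -> Tuple[str, Dict[str, float]]:
    """Pick the act that is MEU-optimal on the largest share of the parameter range."""
    votes: Counter = Counter()
    for k in range(n_grid):
        p = p_lo + (p_hi - p_lo) * k / (n_grid - 1)
        eu = expected_utilities(p)
        votes[max(eu, key=eu.get)] += 1  # the act this distribution mandates
    measures = {act: votes[act] / n_grid for act in ("a1", "a2")}
    return max(measures, key=measures.get), measures


if __name__ == "__main__":
    print(starr_choice(0.6, 0.8))
```

Eating is optimal only for p above .75, roughly a quarter of the range, so under these assumptions the criterion selects a_2 while leaving the credal state itself untouched.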

Whatever the behaviorist arguments, the
revelation of credal state through decisions and MEU
is unattractive in A.I. A system's probability estimates
are based on objective analysis of samples, or on the
opinions of experts, not on the future decision
problems to be faced by the system.

B.  On  Higher-Order  Probabilities. 

Some Bayesians intuit the existence of "higher
order" probabilities (e.g., [Goo83]). These would be
probability distributions on probability distributions,
formalized perhaps, like the indifference "prior" in
Starr's criterion.

If one approves of and has access to such
measures, then acceptance can be based on the
measure. For instance, successive Π's could be
generated by eliminating the next-least probable
members of the previous Π. This strategy leads to
nested Π's; all decisions would be those mandated by
the distribution with the greatest higher-order
measure. It would not, in general, be the same as
taking an expectation over the expected utility
intervals, and ranking the resulting real-values:

$$\sum_k M(P_k) \sum_j P_k(h_j)\, u(\langle a_i, h_j \rangle),$$

where M is the higher-order measure.
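
That the two strategies can disagree is easy to exhibit. The sketch below uses hypothetical numbers: two candidate values of P(G) with an assumed higher-order measure M, and the berry utilities with ~H accepted so that u(a_2) = 0. One function follows the distribution with the greatest higher-order measure; the other ranks acts by the M-weighted expectation of expected utility.

```python
from typing import Dict

# Hypothetical higher-order measure M over two candidate values of P(G).
CANDIDATES: Dict[float, float] = {0.70: 0.6, 0.90: 0.4}

U_A2 = 0.0  # abstaining, with ~H accepted


def u_a1(p_good: float) -> float:
    """Expected utility of eating under a single distribution."""
    return 10 * p_good - 30 * (1 - p_good)


def act_by_most_probable(candidates: Dict[float, float]) -> str:
    """Decide with the single distribution of greatest higher-order measure."""
    p_star = max(candidates, key=candidates.get)
    return "a1" if u_a1(p_star) > U_A2 else "a2"


def act_by_expectation(candidates: Dict[float, float]) -> str:
    """Decide by the M-weighted expectation of the expected utilities."""
    expected = sum(m * u_a1(p) for p, m in candidates.items())
    return "a1" if expected > U_A2 else "a2"


if __name__ == "__main__":
    print("most probable distribution:", act_by_most_probable(CANDIDATES))
    print("expectation over distributions:", act_by_expectation(CANDIDATES))
```

With these numbers the most probable distribution gives eating a negative expected utility, while the less probable one is favorable enough to make the overall expectation positive.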

Perhaps the expectation is appropriate if there is
such a measure. However, one should have misgivings
about the identification of these measures.

There may be uncertainty about the
higher-order measure, reflected in some still higher measure.
This induces a hierarchy of measures. Presumably the
height of the hierarchy is finite. There must be, at
some high order, either a determinate measure, or else
unmeasured indeterminacy. If the former, then one
should be suspicious about the source of a determinate
higher-order measure; why is the probability of a
distribution certain, but the distribution uncertain?
The higher-level is not inherently more robust (note
that the order of the sums can be reversed). Just as a
small error in a probability can change a decision, so
can a small error in a higher-order probability change a
decision.

If  on  the  other  hand  there  is  unmeasured 
indeterminacy,  the  expected-expected  utilities  will  be 
intervals.  This  is  essentially  no  different  from  the 
interval  expected  utilities  from  indeterminate  zero- 
order  probabilities. 

So  acceptance  can  be  conceptually  related  to 
higher  order  probabilities,  but  is  not  immediately 
subsumed  or  improved  by  them. 

VII. Conclusion.

A.I. systems that use interval judgements must
sometimes solve partial ignorance decision problems.
There are now two approaches. Maximizing expected
utility can be followed by maximin, or some other
secondary criterion. Alternatively, additional
assumptions can be made that change probabilities,
temporarily, so that maximizing expected utility is
sufficient. This paper has discussed how to implement
the latter approach. Assumptions are accepted in an
order that tries to avoid error, and they are accepted
only temporarily, for the purposes of decision.
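
The contrast between the two approaches can be sketched as follows. This is a minimal illustration, assuming a hypothetical two-state problem in which each body of knowledge is summarized by the error it tolerates and the interval it yields for Prob(state 1). The utilities, intervals, error levels, and function names are assumptions made for the example, and the acceptance rule that would generate the bodies is not modeled.

```python
UTILITY = {"a1": [10.0, 0.0], "a2": [4.0, 5.0]}   # u(act, state), illustrative


def eu(act, p):
    """Expected utility of act when Prob(state 1) = p."""
    u1, u2 = UTILITY[act]
    return p * u1 + (1 - p) * u2


def decisive_act(p_lo, p_hi):
    """Return the act that is best for every p in [p_lo, p_hi], or None.
    Differences of expected utilities are linear in p, so it is enough
    to compare the acts at the two endpoints."""
    for act in UTILITY:
        if all(eu(act, p) > eu(other, p)
               for other in UTILITY if other != act
               for p in (p_lo, p_hi)):
            return act
    return None


def maximin_act():
    """Secondary criterion: the act whose worst outcome is best."""
    return max(UTILITY, key=lambda act: min(UTILITY[act]))


def decide(bodies):
    """bodies: (tolerated_error, (p_lo, p_hi)) pairs, least error first.
    Use the first body whose interval settles the decision; if none
    does, no decision is mandated."""
    for tolerated_error, (p_lo, p_hi) in bodies:
        act = decisive_act(p_lo, p_hi)
        if act is not None:
            return act, tolerated_error
    return None, None


if __name__ == "__main__":
    bodies = [(0.01, (0.30, 0.70)),   # cautious: wide interval, indecisive
              (0.05, (0.50, 0.65)),   # more relaxed: narrower and decisive
              (0.10, (0.35, 0.42))]   # never consulted; need not be nested
    print(decide(bodies))
    print(maximin_act())
```

With these numbers the widest interval leaves both acts admissible, the second body of knowledge selects a1, and maximin applied to the raw utilities would instead select a2. When no listed body of knowledge is decisive, `decide` returns no act at all.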

There  is  still  the  problem  of  choosing  an 
acceptance  rule,  which  iteratively  generates  the  next- 
best  assumption.  This  choice  requires  considerably 
more  epistemological  reflection. 
VIII. Notes.

^ By indeterminacy, we will mean indeterminacy broadly construed: potential indeterminacy, including
the judgements Prob(A) ∈ [0, 1] (complete) and Prob(A) ∈ [.1, .1] (degenerate), Prob(A) ∈ [.3, .7]
(bounded), and [Prob(A) = .4 or Prob(A) = .8] (disjunctive).

^ Some have charged that the specification of an interval requires two numbers rather than one; hence, it
requires more information. That's silly. Given that some quantity p = 0.67, it follows that p is in
the interval [0.34, 0.97]. Furthermore, in a very natural canonical form, namely, the number of
hyperplanar constraints required in the space of all probability distributions, the information (number of
constraints) in interval reports of a particular probability is less than the information in point reports.
Information measures are dependent on canonical form, hence can be misleading.

Intervals are chosen because they offer robust behavior. If practice shows that they are not robust
enough, that endpoints matter critically, then future investigators can feel free to use a formalism with
indeterminate upper and lower bounds or with fuzzy sets; surely one would not revert to point
probabilities because they contain less information.


^ Here, we've taken informativeness w.r.t. decision to be singularity of FI or singularity of the maximal
set. Other interpretations of "informative" are possible (such as any restriction of the maximal set to
decisions which cannot differ in outcome more than ε). These lead to different decision theories.

Also note that in [Lou85], the amount of tolerable error is addressed (see the discussion of I>
meaningful corpora).

We call this the problem of Jerry's Berries.

^ We've appealed to the epistemological conception of probability here. If explicit statement of chances
is required, the example can be changed.

^ It's also possible to violate nesting when constraints are ordered jointly, and not all constraints are
compatible. So if are constraints on fl's. FT ^ may be delimited by {cy}. and FI2 by {c^}. and

FTj  by  {c^}  before  Flj  by  {C/.  ^2!  Fl^  may  be  delimited  by  {c^.  c^}.  where  {c^.  c>  c^}  isove^ 

determining. If constraints are accepted (rather than knowledge that generates constraints), and
acceptance is purely probabilistic, then this kind of situation requires acceptance levels at or below .5.
With acceptance that is not purely probabilistic, this situation is more natural.

Note that non-nested FI's would seem irrational via a Dutch Book argument, but the agent still posts
consistent odds whenever he considers two or more lotteries simultaneously. It's only when he posts odds
independently and they are subsequently collected that inconsistency arises. Consult the Ellsberg
paradox for intuitions here.
Such a rule would describe when a statement is acceptable and would thus determine to which Kj's it belongs. If the rule is based on probabilities relative to K0, for instance, then a statement A belongs to all those successors, Kj, such that 1 - Prob(A | K0) is less than the threshold associated with Kj. A different probabilistic rule would take

use still narrower intervals? One is already willing to forego certainty, and the amount of certainty one is willing to forego depends on the other desiderata, including decisiveness,
than the convex Bayesian, but they can't choose how narrow the intervals should be.

Shafer himself has formalized decision-making in a different way ([23]; see also [28]). Philosophers have studied other defects of the behavioral stance and the Dutch Book arguments on which they depend (see [12] for a partial bibliography). And some have